Why are oscillations so fascinating for science? You will find an answer to that question (hopefully), as well as learn how to code with CoRK.
This is a teaching document. Whenever there is some code chunk, you can run it. But (as you will quickly see), it will not be completed. Some of the code will need to be put in by you to get to the next result. but don’t worry: We will guide you through everything. Don’t be shy to be creative with your coding - it is all learning by doing.
If you need help, click on the “Hint” tab. There we will give you some pointers on where to look, as well as some extra code guide you in the right direction. But it is always a good idea to use google as well if you don’t know how a certain function works. You can also use the intrinsic help-function of R: ?function gives you the documentation for that function. If you forgot the specific name, you can also use ??function for fuzzy search.
If you are completely lost, you can take a look at the “Solution” tab. Here we will give you the completed code with explanations on why it is working. Of course you can use the Solution tab to check if your result is correct. But don’t worry, if your code does not look like ours. If it achieves the same result, it is probably also true. But it can’t hurt to take a look at where your solution is different from ours and to try and understand, why we did what we did.
We assume that you already know how to work with CoRC a bit or have gone through our examples. Also it would help, if you have already some experience with ggplot, but that is not necessary.
This is an example on how our tabs work. You can click on the others to get help or take a look at our solution.
You successfully clicked on our “Hint” tab. I don’t think that you need help navigating these tabs.
This is our Solution tab. If you clicked on it and this message is showing, everything is working as it should. Now, on with the coding!
To set up our work with CoRC, we first need to install it and tell R that we will be using it. For this, we will call the library() function.
# Your code here
It is good practice to set up all libraries at the beginning of your document and not at the point where you first need it. This avoids confusion. So we will also need to tell R about the other packages we will be using: ggplot2 and plotly.
You need to call the library() function with the package you want to call as an argument.
You also need to make sure to have the package installed before calling it with the library function. If you are unsure if you have the required packages installed, you can check by using find.package(). If you have the package installed, it will tell you where it is installed.
find.package("CoRC")
## [1] "C:/Users/Johanna/Documents/R/win-library/4.0/CoRC"
You first need to install the packages CoRC and ggplot.
# #Install CoRC.
# install.packages("remotes")
# remotes::install_github("jpahle/CoRC")
# CoRC::getCopasi()
#
# #Install ggplot
# install.packages("ggplot2")
# #Install plotly
# install.packages("plotly")
Now, we want to call both of these packages with the library function:
library(CoRC)
library(ggplot2)
library(plotly)
##
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
##
## last_plot
## The following object is masked from 'package:stats':
##
## filter
## The following object is masked from 'package:graphics':
##
## layout
And done! We set up our worksheet :)
This implementation is an example from the Condor-Copasi paper1! You can find the model here:
loadModel("https://raw.githubusercontent.com/copasi/condor-copasi/master/examples/stochastic_test.cps")
## # A COPASI model reference:
## Model name: "Kummer calcium model"
## Number of compartments: 1
## Number of species: 3
## Number of reactions: 8
Make yourself familiar with the model and take a look at the species and the reactions.
To inspect an element with CoRC you use the get family of functions.
We want to look at the species and the reactions. To look at the species we type:
getSpecies()
## # A tibble: 3 x 13
## key name compartment type unit initial_concent~ initial_number
## <chr> <chr> <chr> <chr> <chr> <dbl> <dbl>
## 1 a{co~ a compartment reac~ nmol~ 8 482.
## 2 b{co~ b compartment reac~ nmol~ 8 482.
## 3 c{co~ c compartment reac~ nmol~ 8 482.
## # ... with 6 more variables: concentration <dbl>, number <dbl>, rate <dbl>,
## # number_rate <dbl>, initial_expression <chr>, expression <chr>
and to look at the reactions
getReactions()
## # A tibble: 8 x 6
## key name reaction rate_law flux number_flux
## <chr> <chr> <chr> <chr> <dbl> <dbl>
## 1 (R1) R1 " -> a" FunctionDB.Functions[Constant flux (~ NaN NaN
## 2 (R2) R2 " -> a; ~ FunctionDB.Functions[linear activati~ NaN NaN
## 3 (R3) R3 "a -> ; ~ FunctionDB.Functions[Irr Michaelis-M~ NaN NaN
## 4 (R4) R4 "a -> ; ~ FunctionDB.Functions[Irr Michaelis-M~ NaN NaN
## 5 (R5) R5 " -> b; ~ FunctionDB.Functions[linear activati~ NaN NaN
## 6 (R6) R6 "b -> " FunctionDB.Functions[Henri-Michaelis~ NaN NaN
## 7 (R7) R7 " -> c; ~ FunctionDB.Functions[linear activati~ NaN NaN
## 8 (R8) R8 "c -> " FunctionDB.Functions[Henri-Michaelis~ NaN NaN
We want to show how species “a” behaves over time. For this, we first have to run a Time course.
timecourse <- runTimeCourse()$result
timecourse
## # A tibble: 801 x 4
## Time a b c
## <dbl> <dbl> <dbl> <dbl>
## 1 0 8.00 8.00 8.00
## 2 0.05 6.56 8.24 6.13
## 3 0.1 5.63 8.10 2.77
## 4 0.15 5.33 8.04 0.332
## 5 0.2 5.60 8.14 0.0996
## 6 0.25 5.98 8.04 0.116
## 7 0.3 6.19 7.95 0.448
## 8 0.35 6.74 8.04 0.316
## 9 0.4 7.42 8.12 0.299
## 10 0.45 7.92 8.15 0.731
## # ... with 791 more rows
Fill in the missing code!
# ggplot(data = ) +
# geom_line(aes(x = , y =))
(You have to get rid of the # to uncomment and run the code. If you mark several lines with your pointer and press ctrl+shift+c you can comment/uncomment several lines at once.)
You have to think about what is missing here.
There are some equal signs that only have one side filled out.
data, x, and y are part of what we want to plot.
data is the datatable ggplot draws its data from. We have created a timecourse table that we want to plot, so this is our data.
x is the x-axis of our plot (the time) and y the y-axis (concentration of a). We find both in our datatable that we provided with data.
ggplot(data = timecourse)+
geom_line(aes(x = Time, y = a))
Now, can you do the same for b and c?
#Your code here
Take a look at the code for visualizing a and think about what is the same and what is different for b and c.
What we have to change about the code form above is the part telling ggplot what we want to have on our y axis. In the last plot, we wanted a there and now we want b or c.
For b this will look like this:
ggplot(data = timecourse) +
geom_line(aes(x = Time, y = b))
Of course, we can put all species in one plot. To see which species are which, we will also color the lines. We can tell ggplot implicitly to color “a” different than “b” by calling the color “a” or “b”. Of course you can also explicitly tell which color to use by saying “color = ‘red’” (after the aes()command) for example.
ggplot(data = timecourse) +
geom_line(aes(x = Time, y = a, color = "a")) +
geom_line(aes(x = Time, y = b, color = "b")) +
geom_line(aes(x = Time, y = c, color = "c"))
Now, the question presents itself: How and why does this model oscillate? And why does it matter?
To answer the last question first, oscillators are fascinating because a lot of fascinating signalling pathways in biology oscillate. You can think about why that might have some sort of evolutionary advantage. What is in our environment that changes over time in an oscillating way?
The first thing that might come to mind is our day-and-night-cycle. And indeed, we find oscillating signalling pathways here. Even without external stimuli these oscillation prevail. For example, one of biologist’s favourite pet, the fruit fly Drosophila melanogaster has the proteins PER (short for period) and TIM (short for timeless) that act as their “clock” (Tyson, 1999)2
The fist question is a bit harder to answer.
Take a look at our oscillating model below:
This is a representation of a so called predator-prey model from Lotka-Volterra 3 We will build the model in the following tasks.
We call the elements in our model “species”. Here, this is even more true. We will have two species, “Cats” and “Mice”.
We will first have to create a new model and then our two species.
#Your code here
To create something new, we use the new family of functions. Just try it, RStudio tries to autofill your functions, so you might find what you need like that.
#First, make a new model
newModel()
## # A COPASI model reference:
## Model name: "New Model"
## Number of compartments: 0
## Number of species: 0
## Number of reactions: 0
#Then two new species
newSpecies("Cats")
## Warning: No compartment exists. Created default compartment.
## [1] "Cats{compartment}"
newSpecies("Mice")
## [1] "Mice{compartment}"
We will want to create 4 functions:
Remember, Cats need Mice to multiply!
#newReaction()
We want to create a reaction for breeding and one for dying. For mice this might look like this:
#Cats being born:
newReaction("Mice -> 2 * Mice")
## [1] "(Mice -> 2 * Mice)"
#Cats dying:
newReaction("Mice ->")
## [1] "(Mice ->)"
For the Cats it looks similar, but remember, Cats need to eat.
When creating a reaction, we need to use the newReaction() function.
For the “dying”-functions, we tell CoRC that there was something there and then it isn’t there anymore:
newReaction("Cats -> ")
## [1] "(Cats ->)"
newReaction("Mice -> ")
## [1] "(Mice ->)"
For more mice we need mice in the first place, so a similar " -> Mice" won’t work, even if this is possible in CoRC. This would be the case if we have a constant migration of mice into our space. Instead we will multiply mice and cats, while cats also need to eat mice in order to multiply:
newReaction("Mice -> 2 * Mice")
## [1] "(Mice -> 2 * Mice)"
newReaction("Cats + Mice -> 2 * Cats")
## [1] "(Cats + Mice -> 2 * Cats)"
Because we are dealing in concentrations (in our example that would be mice / square meter or something similar) we can say that one mouse can create two mice, even if that doesn’t necessarily make sense. # {-}
If we take a look at our model now
we see, that is does not yet behave in a way we want it to. This is because we still need to take a look at the parameters of our model.
Most obviously we can see that it probably does not make sense to start with our two species at the same concentration. In nature we find that usually there is an overabundance of prey over predators, otherwise at some point we would not find any more mice.
So let us fix that: We can take a look at our species like this:
getSpecies()
## # A tibble: 2 x 13
## key name compartment type unit initial_concent~ initial_number
## <chr> <chr> <chr> <chr> <chr> <dbl> <dbl>
## 1 Cats~ Cats compartment reac~ mol/l 1 6.02e23
## 2 Mice~ Mice compartment reac~ mol/l 1 6.02e23
## # ... with 6 more variables: concentration <dbl>, number <dbl>, rate <dbl>,
## # number_rate <dbl>, initial_expression <chr>, expression <chr>
and then set the value we want with the set command. For the key argument in the setSpecies() function we want all keys available, because we want to set all species. We can do this like this:
setSpecies(key = getSpecies()$key, initial_concentration = c(0.1, 10))
We can do something similar for our parameters.
# Take a look at the parameters and change them to what you think is a good value.
First, we need to look at our parameters. Try doing that similarly how we looked at the species. The setting of the parameters also similar to how we set our species.
For which value to use, think about how fast reactions should be compared to the others.
First, let us take a look at our parameters.
getParameters()
## # A tibble: 4 x 5
## key name reaction value mapping
## <chr> <chr> <chr> <dbl> <chr>
## 1 (Cats ->).k1 k1 Cats -> 0.1 <NA>
## 2 (Mice ->).k1 k1 Mice -> 0.1 <NA>
## 3 (Mice -> 2 * Mice).k1 k1 Mice -> 2 * Mice 0.1 <NA>
## 4 (Cats + Mice -> 2 * Cats).k1 k1 Cats + Mice -> 2 * Cats 0.1 <NA>
We can see that all parameters have the same value. This does not really make sense. We can assume that Mice probably multiply faster than cats; and that cats have a longer lifespan than mice. So we will set our parameters to 0.01 for Cats dying and 0.1 for Cats multiplying as well as 0.2 for Mice multiplying and 0.02 for mice dying.
We can set these values similarly to how we set the species:
setParameters(key = getParameters()$key, value = c(0.01, 0.02, 0.2, 0.1))
If you do not understand how we can use this expression to set the key argument, try running it alone.
We can now take a look at how our model behaves. Try making the plot yourself.
# Make a plot
We already made a plot above. We need to run a timecourse to get our points for the plot and then use ggplot for plotting. This time, we also need colors; so take a look at the color argument of the ggplot aes() function, where we specified x and y.
First, we will run a timecourse. We set the duration of the timecourse to 1000
timecourse <- runTimeCourse(duration = 1000)
If you take a look at the timecourse variable by typing the variable in your console we can see that it is a list of a lot of different points. We are only interested in the result part. You can subset lists in R with the $ operator.
timecourse <- timecourse$result
This is now a table of our species, but ggplot needs all values to be in one column and any differentiation between them (in our case the species and time) in another column. This is also called the long format of tables. We can use the pivot_longer() command to easily transfer all we need. Time is already in a seperate column, so we do not need to do anything with it. We can do that with -Time. We use the names_to argument to specify how we would like to call the column where the names for the old columns should go; the values_to argument does the same but for values.
timecourse_long <- pivot_longer(data = timecourse, -Time, names_to = "Species", values_to = "Concentration")
To get a better understanding what the pivot_longer() function does, you can take a look at the two tables: timecourse and timecourse_long.
Now we have everything we need for ggplot. We call the ggplot() function. As data, we need our prepared timecourse, timecourse_long. On the x-axis we want the time and on the y-axis we want the values (concentrations). To differentiate between our species, we also need color.
ggplot(data = timecourse_long, aes(x = Time, y = Concentration, color = Species))+
geom_line()
# {-}
And we just made our own, oscillating model! But why and how does our model oscillate?
In a helpful review from Novák and Tyson 4 we find the following requirements for oscillators:
We can try and explain how our model oscillates by going through those point by point.
We will now take a look at another model. This model is about oscillaitons in calcium signalling (Kummer, 2000) 5. This model has three species: G-alpha, activePLC and Calcium. All three of these species can oscillate, so we can visualize this in a 3D-plot.
But first, we need to load our model and create a timeseries. We will use a duration of 24, which is 2 Periods with 10000 intervals.
# load this sbml-file:
#"https://www.ebi.ac.uk/biomodels/model/download/BIOMD0000000329?filename=BIOMD0000000329_url.xml"
#run timecourse
To load an SBML file we use loadSBML(). The timecourse needs two arguments, duration and intervals
First, we need to load the SBML file:
loadSBML("https://www.ebi.ac.uk/biomodels/model/download/BIOMD0000000329?filename=BIOMD0000000329_url.xml")
## # A COPASI model reference:
## Model name: "Kummer2000 - Oscillations in Calcium Signalling"
## Number of compartments: 1
## Number of species: 3
## Number of reactions: 8
You can take a look at this model using the getSpecies() and getReactions() function.
The timecourse has two arguments: duration needs to be set to 24 and intervals to 10000.
timecourse <- runTimeCourse(duration = 24, intervals = 10000)$result
We will use the plotly package to make a plot, because ggplot2 does not support 3d graphics. With our plotly solution you can also interact with the plot in the browser.
det <- plot_ly(data = timecourse,
type = "scatter3d",
mode = "lines",
x = ~ `G-alpha`,
y = ~ activePLC,
z = ~ Calcium,
color = ~ Time)
det
## Warning: `arrange_()` is deprecated as of dplyr 0.7.0.
## Please use `arrange()` instead.
## See vignette('programming') for more help
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_warnings()` to see where this warning was generated.
So far, we have only used deterministic modeling. But we can also run a timecourse stochastically! For this, we need the argument method.
# Run time course
# tc_stochastic <- timecourse(duration = 24, intervals = 10000, method = )
# Make plotly plot
The method needed is called *“directMethod”. For the plot our code will look very similar to before.
First, we run the time course. Here, we will use a random seed so whoever uses the code will get the same solution. But try it without the random seed serveral times and see, if you get different solutions!
tc_stochastic <- runTimeCourse(duration = 24, intervals = 1000, method = list(method = "directMethod",
use_random_seed = TRUE,
random_seed = 1))$result
Now, we will create another plotly-plot:
stoc <- plot_ly(data = tc_stochastic,
type = "scatter3d",
mode = "lines",
x = ~ `G-alpha`,
y = ~ activePLC,
z = ~ Calcium,
color = ~ Time
)
stoc
As we can see the two cycles for the deterministic solution do not really look all that different from each other. For the stochastic solution on the other hand, we can really see that a different random seed, or just letting the cycle run twice really changes how our solution looks.
We can now take a look at our model in a different way. What do you think will happen, if we plot the two species of our model - one on the x-axis and one on the y-axis:
What we see here we can call a limit cycle. Stable limit cycle oscillations will always return to the limit cycle, no matter the starting conditions. Obviously, this is very useful for keeping a system in balance.
We will try a parameter scan over different starting conditions and plot these:
We want 10 different conditions for starting values of our species. We want them normally distributed around the current value for each species with a standard deviation of 15 % of the current value
scan_values <- tibble(Galpha = c(), activePLC = c(), Calcium = c())
Afterwards, we want to run a timecourse for each of the ten combinations in scan_values.
# run 10 time courses
And lastly, we want to plot everything.
# plot
You can generate random numbers that are normally distributed using rnorm.
You can use a for-loop to run your timecourses. Make sure you set your species before each timecourse.
The plotting part should look similar to above
First, we want to make 10 starting conditions. These should be normally distributed with a mean of the current initial concentration and a standard deviation of 15 % of the current initial concentration.
We will first generate a vector of our initial concentrations, called species and a vector of our standard deviation sd by multiplying species with 0.15 .
species <- getSpecies()$initial_concentration
sd <- species * 0.15
We can now create our scan_value tibble by using the rnorm function. rnorm() creates n random numbers, normally distributed around a mean and with a standard deviation sd. We will now fill our tibble with our three species:
scan_values <- tibble(Galpha = rnorm(10, mean = species[1], sd = sd[1]),
activePLC = rnorm(10, mean = species[2], sd = sd[2]),
Calcium = rnorm(10, mean = species[3], sd = sd[3]))
We will first run a time course to append to and afterwards run 10 time courses with different inital concentrations for our species in a for-loop.
The for loop runs 10 times, and each time the species will be changed and another timecourse run and appended to the ones before. To keep track of which round we are in, we append a column nr.
tc <- runTimeCourse(duration = 24, intervals = 100)$result %>% add_column(nr = rep(0, 101))
for (scan in 1:10){
setSpecies(key = getSpecies()$key, initial_concentration = unlist(scan_values[scan,]))
tc_new <- runTimeCourse(duration = 24, intervals = 100)$result %>% add_column(nr = rep(scan, 101))
tc <- add_row(tc, tc_new)
}
The plotting works the same as before. But instead of time, this time we want to color in our run (nr).
ggplot()+
geom_line(data = tc, aes(x = activePLC, y = Calcium, color = nr))
We can also take a look at how that looks in 3d:
multiple3d <- plot_ly(data = tc,
type = "scatter3d",
mode = "lines",
x = ~ `G-alpha`,
y = ~ activePLC,
z = ~ Calcium,
color = ~ nr
)
multiple3d
Kent, Edward, Stefan Hoops, and Pedro Mendes. “Condor-COPASI: high-throughput computing for biochemical networks.” BMC systems biology 6.1 (2012): 91.↩︎
Tyson, John J., et al. “A simple model of circadian rhythms based on dimerization and proteolysis of PER and TIM.” Biophysical journal 77.5 (1999): 2411-2417.↩︎
HIER BRAUCHE ICH NOCH EINE QUELLE!↩︎
Novák, Béla, and John J. Tyson. “Design principles of biochemical oscillators.” Nature reviews Molecular cell biology 9.12 (2008): 981-991.↩︎
Kummer, Ursula, et al. “Switching from simple to complex oscillations in calcium signaling.” Biophysical journal 79.3 (2000): 1188-1195.↩︎